Abstract: Part I [1] of this paper discusses the problem of
learning the operational structure of the grid from nodal voltage
measurements. In this work (Part II), the learning of the operational
radial structure is coupled with the problem of estimating
nodal consumption statistics and inferring the line parameters in
the grid. Based on a Linear-Coupled (LC) approximation of the AC
power flow equations, polynomial-time algorithms are designed
to complete these tasks using the available nodal complex voltage
measurements. The structure learning algorithm is then extended
to cases with missing data, where available observations are
limited to a fraction of the grid nodes. The efficacy of the
presented algorithms is demonstrated through simulations on
several distribution test cases.

Index Terms: Power Distribution Networks, Power Flows,
Structure/Graph Learning, Load Estimation, Parameter Estimation,
Voltage Measurements, Transmission Lines, Missing Data.


I. Introduction 

The present power grid is separated into different tiers 
for optimizing its operations and control, namely the high 
voltage transmission system and the medium and low voltage 
distribution system. The distinction between these systems 
extends to their operational structure; the transmission system 
is a loopy graph while the distribution system operates as 
a radial network (set of trees). The larger volume of power 
transferred and higher magnitudes of resident voltages in the 
transmission network as compared to the distribution network 
have led grid security and reliability studies to focus primarily 
on the transmission side. Traditionally, the distribution grid 
has thus suffered from low placement of measurement devices 
leading to negligible real-time observation and control efforts.

In Part I [1] of this paper, we study the design of low-complexity
algorithms for learning the operational radial structure
of the distribution grid despite available metering limited
to nodal voltages. In this work, we extend the study to the
problem of estimating other features of the distribution grid
together with learning the operational structure. Specifically, we
utilize available nodal complex voltages to learn the statistics
of load profiles at the grid nodes and to estimate the complex-valued
impedance parameters of the operational distribution
lines. It is worth noting that line/edge based metering (line flow
and breaker status measurements) is considered unavailable,
as such quantities are seldom observed in real time in today’s grids.
Next, we extend the problem of learning the grid structure 
introduced in Part I to the case with partial observability, where 
voltage measurements pertaining to a subset of the nodes are 
not observed. In essence, the results from this work can aid
several areas that have gained prominence with the expansion
of the smart grid. These include failure identification, grid
reconfiguration, power flow optimization and generation
scheduling, as well as privacy preserving grid
operation. Furthermore, learning under partial observability
enables the quantification of measurement security necessary
to prevent adversarial learning aimed at hidden topological
attacks [8].

‘Graph Learning’ or ‘Graphical Model Learning’ is
a broad area of work that has been considered in different
domains. In general graphs, maximum likelihood has been
employed for learning graph structures through convex
optimization as well as greedy techniques. In a learning study
specific to general power grids, a maximum likelihood
structure estimator (MLE) based on electricity prices has been presented. For
radial distribution grids, prior work discusses structure
learning through construction of a spanning tree based on the
inverse covariance matrix (or concentration matrix) of voltage
measurements, as well as topology identification with
Gaussian loads through a maximum likelihood scheme.

In Part I [1], an approach that uses provable trends in
second moments of nodal voltage magnitudes to learn the
grid structure was presented. Our algorithm design in Part I
assumes that all nodal loads are, in expectation, consumers
of active and reactive power, which is realistic for most, if
not all, current distribution grids. Here in Part II, we use a
modified but non-conflicting assumption of independence of
fluctuations in active and reactive loads at different nodes.
As shown below, under this assumption one is able not only
to reconstruct the grid structure but also to infer either
the statistics of active and reactive loads at every node or
the values of impedance parameters at every operational line.
Then, we show how to extend our structure learning algorithm
to cases with missing data, where observations from a subset
of nodes are not available to the observer. As in Part I,
the algorithms here (Part II) are independent of the exact
probability distribution of load profiles as well as variations
in values of line parameters and are thus applicable to a wide
range of operational conditions.

The rest of this manuscript (Part II) is organized as follows.
Section II contains a brief review of the radial structure of the
grid and of power flow approximations, and formulates the
problems considered. Section III contains proofs of our main
results on second moments of voltage measurements in radial
grids. Section IV describes the algorithm design to learn the
operational structure and estimate the statistics of load power
profiles in the grid. An extension is also discussed for the
problem of structure learning coupled with estimation of line
impedances (instead of injection statistics). In Section V, we
present Algorithm 2, which learns the operational radial structure
in the presence of missing observations. Simulation results for
our algorithms on test radial distribution cases are presented
in Section VI. Finally, conclusions are discussed in the last Section.

II. Technical Preliminaries

This Section provides a brief description of the operational
structure of the distribution grid and introduces the learning
problems considered in Part II. We then briefly recall
the Linear Coupled Power Flow (LC-PF) model (already
introduced and discussed in Part I) that we rely on for analysis
in later Sections.

Structure of Radial Distribution Network: A distribution
grid is represented by a graph $\mathcal{G} = (\mathcal{V}, \mathcal{E})$, where $\mathcal{V}$ (of size
$N + K$) is the set of nodes/buses and $\mathcal{E}$ is the set of undirected
edges/transmission lines. The complete layout of $\mathcal{G}$ is loopy,
but its operational layout (denoted by $\mathcal{T}$), derived by excluding
open/non-operational lines, is a union of $K$ non-intersecting
trees. Each grid tree $\mathcal{T}_k$ in $\mathcal{T}$ comprises a single substation
feeding electricity into load nodes lined along the ‘radial’ tree.
Thus, $\mathcal{T}$ is a ‘base-constrained spanning forest’ with $N$
non-substation nodes. See Fig. 1 in Part I [1] for an illustrative
example. The set of operational edges that contribute to the
structure of the forest $\mathcal{T}$ is denoted by $\mathcal{E}^{\mathcal{T}}$, where $\mathcal{E}^{\mathcal{T}} \subset \mathcal{E}$. We
follow the same notation as Part I, described in Table I of [1].

Summary of Learning Problems: The majority of distribution
grids operational today are handicapped by limited real-time
metering for breaker statuses and power flows, as well
as infrequent updating of model parameters. The grid operator
(utility company) or an external observer/adversary in such a
scenario is concerned with the following three tasks:

(1) To learn the current configuration of switches that determines
the ‘base-constrained spanning forest’.

(2) To learn the statistics of the power consumption¹ profiles
at the nodes.

(3) To learn the values of resistances and reactances of each
operational line of the distribution system.

For all these tasks, the utility or observer relies on available
nodal complex voltage (magnitude and phase) readings. Task
(1) is coupled with either Task (2) or Task (3) and considered
first in the situation of full observability, when complex voltage
(magnitude and phase) samples are available at all the nodes of
the system. In fact, we show that voltage magnitude samples
are sufficient to learn the grid structure (Task (1)); additional
voltage phase measurements are needed only for the inference
problems in Tasks (2) and (3). We also discuss
Task (1) independently in the situation where several nodes
do not offer any voltage readings. The problem formulations
considered previously in Part I and in Part II are summarized
in Table I.

¹We use the terms ‘power injection’, ‘power consumption’ and ‘load’
interchangeably to denote the power profile at each interior (non-substation) node
of the distribution system.

The physics of Power Flows (PFs) in $\mathcal{T}$ forms the background
for the learning/reconstruction problems sketched here.
A variety of PF models/approximations were discussed in detail
in Appendix 1 and Sections III-A to III-C of Part I [1]. Let us
briefly recap the features of the Linear-Coupled Power
Flow (LC-PF) model that are essential for the analysis presented in the
following Sections, extending it with some new notation.

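Linear Coupled Power Flow (LC-PF): Let $r$ and $x$
denote the diagonal matrices representing, respectively, line
resistances and reactances for operational edges in forest $\mathcal{T}$.
Let the $N \times 1$ real valued vectors $p$, $q$, $\varepsilon$ and $\theta$ denote the active
power injections, reactive power injections, voltage magnitude
deviations and voltage phase angles at the non-substation nodes,
respectively. The LC-PF model is given by the matrix Eqs. (5,6)
of Part I, which involve edge-weighted reduced graph
Laplacian matrices (after removing sub-station/slack buses) for
forest $\mathcal{T}$ with edge weights given by the edge conductances
and susceptances respectively. $M$ is the reduced directed
incidence matrix with each row corresponding to a directed
edge $(ab)$ in $\mathcal{E}^{\mathcal{T}}$. In fact, $M$ is block diagonal with $M =
\mathrm{diag}(M_1, M_2, \cdots, M_K)$, where each block $M_k$ corresponds to
a tree $\mathcal{T}_k$ in $\mathcal{T}$. In what follows, $H_{1/r} = M^T r^{-1} M$ and
$H_{1/x} = M^T x^{-1} M$ denote the reduced Laplacian matrices with edge
weights $1/r$ and $1/x$. Assuming that $p$ and $q$ in Eqs. (5,6) of Part I
are fluctuating, we derive the following relations involving the
means $\mu_x$ and covariance matrices $\Omega_{xy} = E[(x - \mu_x)(y - \mu_y)^T]$
for variables $x$ and $y$.

$$\mu_\theta = H_{1/x}^{-1}\mu_p - H_{1/r}^{-1}\mu_q \tag{1}$$

$$\mu_\varepsilon = H_{1/r}^{-1}\mu_p + H_{1/x}^{-1}\mu_q \tag{2}$$

$$\Omega_\theta = H_{1/x}^{-1}\Omega_p H_{1/x}^{-1} + H_{1/r}^{-1}\Omega_q H_{1/r}^{-1} - H_{1/x}^{-1}\Omega_{pq} H_{1/r}^{-1} - H_{1/r}^{-1}\Omega_{qp} H_{1/x}^{-1} \tag{3}$$

$$\Omega_\varepsilon = H_{1/r}^{-1}\Omega_p H_{1/r}^{-1} + H_{1/x}^{-1}\Omega_q H_{1/x}^{-1} + H_{1/r}^{-1}\Omega_{pq} H_{1/x}^{-1} + H_{1/x}^{-1}\Omega_{qp} H_{1/r}^{-1} \tag{4}$$

$$\Omega_{\theta\varepsilon} = H_{1/x}^{-1}\Omega_p H_{1/r}^{-1} - H_{1/r}^{-1}\Omega_q H_{1/x}^{-1} + H_{1/x}^{-1}\Omega_{pq} H_{1/x}^{-1} - H_{1/r}^{-1}\Omega_{qp} H_{1/r}^{-1} \tag{5}$$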

It is worth mentioning that inclusion of both line resistances
and reactances in the LC-PF model distinguishes it from the
DC power flow models, which have limited applicability
in distribution grids. In the next Section, we derive key
results relating second moments in phase angles and voltage
magnitudes in the LC-PF model for a radial distribution grid. Versions
of all subsequent results can be generated for DC power flow
models by simply ignoring line resistances or reactances as
demonstrated in Part I.

TABLE I
Summary of Learning Problems/Statements

Observations available | Prior information | Assumptions | Features estimated | Results used
-----------------------+-------------------+-------------+--------------------+-------------
Voltage magnitudes of all nodes | True second moment of nodal power injections, resistance and reactance of edges | Non-negative second moments of nodal power injections | Operational network structure | Algorithm 1 in Part I
Voltage magnitudes of all nodes | None | Uncorrelated nodal power injections | Operational network structure (Task (1)) | Theorem 1, Theorem 2, Algorithm 1
Voltage magnitudes and phasors of all nodes | Resistance and reactance of edges | Uncorrelated nodal power injections | Mean and variance of nodal power injections (Task (2)) | Lemma 2, Algorithm 1
Voltage magnitudes and phasors of all nodes | True variance of nodal power injection | Uncorrelated nodal power injections | Resistance and reactance of operational lines (Task (3)) | Lemma 2, Algorithm 1
Voltage magnitudes of subset of nodes | True variance of nodal power injections, resistance and reactance of edges | Uncorrelated nodal power injections, missing nodes separated by three or more hops | Operational network structure | Theorem 1, Lemma 2, Algorithm 2

Fig. 1. Schematic layout of a distribution grid tree. The sub-station node,
represented by the large red node, is the slack bus. (a) Dotted lines represent the
paths from nodes a and d to the slack bus; the corresponding entry of $H_{1/r}^{-1}$
is the sum of the resistances of the edges shared by the two paths. (b)
Here, nodes a and c are descendants of node a.
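The moment relations of this Section can be illustrated on a toy feeder. The following is a minimal sketch in plain Python, not the authors' code: the three-node tree, the line parameters and the load statistics are all hypothetical, and the helper names (`path_edges`, `h_inv`, `mean_eps`, `mean_theta`, `cov_eps`) are introduced only for this example. It assumes uncorrelated nodes and uses the fact that, on a radial grid, an entry of the inverse reduced Laplacian is the sum of line weights over the path edges shared by two nodes.

```python
# Hypothetical radial feeder: substation 0 feeds node 1; nodes 2 and 3 hang off node 1.
parent = {1: 0, 2: 1, 3: 1}
r = {1: 0.10, 2: 0.20, 3: 0.15}  # resistance of line (a, parent[a]), keyed by child a
x = {1: 0.05, 2: 0.10, 3: 0.30}  # reactance of the same line
nodes = sorted(parent)

def path_edges(a):
    """Edges (named by their child node) on the unique path from a to the substation."""
    edges = set()
    while a != 0:
        edges.add(a)
        a = parent[a]
    return edges

def h_inv(weight, a, b):
    """Entry (a, b) of the inverse reduced Laplacian: weights summed over shared path edges."""
    return sum(weight[e] for e in path_edges(a) & path_edges(b))

# Assumed load statistics (loads consume power, hence negative mean injections).
mu_p = {a: -1.0 for a in nodes}
mu_q = {a: -0.5 for a in nodes}
var_p = {a: 0.04 for a in nodes}
var_q = {a: 0.01 for a in nodes}
cov_pq = {a: 0.01 for a in nodes}  # positive same-node correlation

def mean_eps(a):
    """Mean voltage magnitude deviation at node a."""
    return sum(h_inv(r, a, c) * mu_p[c] + h_inv(x, a, c) * mu_q[c] for c in nodes)

def mean_theta(a):
    """Mean phase angle at node a."""
    return sum(h_inv(x, a, c) * mu_p[c] - h_inv(r, a, c) * mu_q[c] for c in nodes)

def cov_eps(a, b):
    """Entry (a, b) of the voltage magnitude covariance when nodes are uncorrelated."""
    total = 0.0
    for c in nodes:
        ra, rb = h_inv(r, a, c), h_inv(r, b, c)
        xa, xb = h_inv(x, a, c), h_inv(x, b, c)
        total += ra * rb * var_p[c] + xa * xb * var_q[c] + (ra * xb + xa * rb) * cov_pq[c]
    return total
```

In this toy case the mean voltage deviation becomes more negative, and its variance larger, as one moves from node 1 to node 2, which is the qualitative behavior exploited in the next Section.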

III. Second Moments of Voltages in Radial Grids 

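Consider a tree $\mathcal{T}_k$ with reduced incidence matrix $M_k$. Let
$\mathcal{E}_a^{\mathcal{T}_k}$ denote the unique path from node $a$ to the slack bus of the
tree, where the path between two nodes refers to the unique set
of edges connecting them. As shown in Part I [1], in a radial
distribution grid, $H_{1/r}^{-1}$ has the following structure:

$$H_{1/r}^{-1}(a,b) = \begin{cases} \sum_f M_k^{-1}(a,f)\, r(f)\, M_k^{-1}(b,f) = \sum_{(cd) \in \mathcal{E}_a^{\mathcal{T}_k} \cap \mathcal{E}_b^{\mathcal{T}_k}} r_{cd} & \text{if nodes } a, b \in \mathcal{T}_k, \\ 0 & \text{otherwise.} \end{cases} \tag{6}$$

Let $D_a^{\mathcal{T}_k}$ denote the set of descendants of node $a$ within the
tree $\mathcal{T}_k$, where $b$ is called a descendant of $a$ if $a$ lies on the
(unique) path from $b$ to the slack bus of $\mathcal{T}_k$. We include $a$ itself
in the set of its descendants. Similarly, we call $b$ the parent of
$a$ within $\mathcal{T}_k$ if $a$ is an immediate descendant of $b$, as illustrated
in Fig. 1(b).

The following statement holds (see Lemma 1 in [1] for a
detailed proof).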


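Lemma 1. For two nodes, $a$ and its parent $b$, in tree $\mathcal{T}_k$,

$$H_{1/r}^{-1}(a,c) - H_{1/r}^{-1}(b,c) = \begin{cases} r_{ab} & \text{if node } c \in D_a^{\mathcal{T}_k}, \\ 0 & \text{otherwise.} \end{cases} \tag{7}$$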
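Lemma 1 can be checked numerically on a small example. The sketch below is illustrative only: the five-node tree and its resistances are hypothetical, and `path_edges`, `h_inv_r`, `descendants` and `lemma1_gap` are names chosen for this example. It evaluates the path-sum structure of the inverse Laplacian and confirms that the difference between a node's row and its parent's row equals the connecting line resistance on the node's descendants and vanishes elsewhere.

```python
# Hypothetical tree: substation 0 -> 1; 1 -> 2 and 1 -> 3; 2 -> 4.
parent = {1: 0, 2: 1, 3: 1, 4: 2}
r = {1: 0.10, 2: 0.20, 3: 0.15, 4: 0.05}  # resistance of line (a, parent[a])

def path_edges(a):
    """Edges (named by their child node) on the path from a to the substation."""
    edges = set()
    while a != 0:
        edges.add(a)
        a = parent[a]
    return edges

def h_inv_r(a, b):
    """Inverse-Laplacian entry: resistances summed over edges shared by both paths."""
    return sum(r[e] for e in path_edges(a) & path_edges(b))

def descendants(a):
    """Nodes whose path to the substation passes through a (a itself included)."""
    return {c for c in parent if a in path_edges(c)}

def lemma1_gap(a, c):
    """Difference between the rows of node a and of its parent, evaluated at column c."""
    return h_inv_r(a, c) - h_inv_r(parent[a], c)

# Collect every (a, c) pair that violates the statement of Lemma 1; expected to be empty.
violations = [
    (a, c)
    for a in parent
    for c in parent
    if abs(lemma1_gap(a, c) - (r[a] if c in descendants(a) else 0.0)) > 1e-12
]
```

The substation row is treated as identically zero here, consistent with the slack bus being removed from the reduced matrices.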


Before the discussion of our results on trends in voltage
covariances, we make the following assumption on the
covariances of load consumption profiles.

Assumption 1: Powers at different nodes are not correlated,
while active and reactive powers at the same node are positively
correlated. Thus, $\forall a, b \neq a \in \{1, \ldots, N\}$,

$$\Omega_{qp}(a,a) > 0, \quad \Omega_p(a,b) = \Omega_q(a,b) = \Omega_{qp}(a,b) = 0.$$

A few remarks are in order. First, the assumption of
independence of fluctuations is realistic in general, reflecting
diversity of individual consumer behavior on relatively short
time scales. Second, unless consumer-level control of reactive
power is implemented, fluctuations in
active and reactive consumption/generation at the same node
will have a strong tendency to align, giving positive correlation.
Since Assumption 1 pertains to covariances (‘centered’
second moments), it does not run counter to the assumption
in Part I, where ‘non-centered’ second moments of power
injections are considered to be positive. In fact, nodal loads
(consumers of active and reactive power) satisfy both the
assumptions given in Part I and Part II. Note that Assumption
1 does not restrict individual nodal loads to follow any specific
distribution.

The following result states that variances of voltage
magnitude deviations increase as we move farther away from
the root of any tree in the grid.

Theorem 1. If node $a \neq b$ is a descendant of node $b$ on tree
$\mathcal{T}_k$ in forest $\mathcal{T}$, then $\Omega_\varepsilon(a,a) > \Omega_\varepsilon(b,b)$.

Proof: Under Assumption 1, $\Omega_\varepsilon$ derived in Section II is a sum of four non-negative
terms. Let the first term, $H_{1/r}^{-1}\Omega_p H_{1/r}^{-1}$, be
denoted by $\Omega_\varepsilon^1$. For one-hop neighbors, node $a$ and its parent
$b$, we use Lemma 1 to get

$$\Omega_\varepsilon^1(a,a) - \Omega_\varepsilon^1(a,b) = \sum_{c \in D_a^{\mathcal{T}_k}} H_{1/r}^{-1}(a,c)\,\Omega_p(c,c)\, r_{ab} > 0 \tag{8}$$

$$\Omega_\varepsilon^1(a,b) - \Omega_\varepsilon^1(b,b) = \sum_{c \in D_a^{\mathcal{T}_k}} H_{1/r}^{-1}(b,c)\,\Omega_p(c,c)\, r_{ab} > 0 \tag{9}$$

Combining the inequalities, we get $\Omega_\varepsilon^1(a,a) > \Omega_\varepsilon^1(b,b)$. Extending
the same analysis to the remaining three terms of
$\Omega_\varepsilon$ and then moving from one-hop neighbors to descendants
proves the theorem. ■

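Next, we focus on the term $E[(\varepsilon_a - \mu_{\varepsilon_a}) - (\varepsilon_b - \mu_{\varepsilon_b})]^2$,
which is the expected value of the squared centered difference
between two node voltage deviations ($\varepsilon$). For any two nodes
$a$ and $b$ that lie on tree $\mathcal{T}_k$, we have

$$E[(\varepsilon_a - \mu_{\varepsilon_a}) - (\varepsilon_b - \mu_{\varepsilon_b})]^2 = \Omega_\varepsilon(a,a) - \Omega_\varepsilon(a,b) + \Omega_\varepsilon(b,b) - \Omega_\varepsilon(b,a),$$

where $\Omega_\varepsilon$ is composed of four terms. Expanding each of
the four terms within $\Omega_\varepsilon$ under Assumption 1 and adding them,
we derive

$$E[(\varepsilon_a - \mu_{\varepsilon_a}) - (\varepsilon_b - \mu_{\varepsilon_b})]^2 = \sum_{c \in \mathcal{T}_k} \Big(H_{1/r}^{-1}(a,c) - H_{1/r}^{-1}(b,c)\Big)^2 \Omega_p(c,c) + \Big(H_{1/x}^{-1}(a,c) - H_{1/x}^{-1}(b,c)\Big)^2 \Omega_q(c,c) + 2\Big(H_{1/r}^{-1}(a,c) - H_{1/r}^{-1}(b,c)\Big)\Big(H_{1/x}^{-1}(a,c) - H_{1/x}^{-1}(b,c)\Big) \Omega_{pq}(c,c) \tag{10}$$

For the special case where node $b$ is the parent of node $a$,
using Lemma 1 in Eq. (10), we obtain the following result.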


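Lemma 2. If $b$ is $a$’s parent in tree $\mathcal{T}_k$, then

$$E[(\varepsilon_a - \mu_{\varepsilon_a}) - (\varepsilon_b - \mu_{\varepsilon_b})]^2 = \sum_{c \in D_a^{\mathcal{T}_k}} r_{ab}^2\,\Omega_p(c,c) + x_{ab}^2\,\Omega_q(c,c) + 2 r_{ab} x_{ab}\,\Omega_{pq}(c,c) \tag{11}$$

$$E[(\theta_a - \mu_{\theta_a}) - (\theta_b - \mu_{\theta_b})]^2 = \sum_{c \in D_a^{\mathcal{T}_k}} x_{ab}^2\,\Omega_p(c,c) + r_{ab}^2\,\Omega_q(c,c) - 2 r_{ab} x_{ab}\,\Omega_{pq}(c,c) \tag{12}$$

$$E[(\varepsilon_a - \mu_{\varepsilon_a} - \varepsilon_b + \mu_{\varepsilon_b})(\theta_a - \mu_{\theta_a} - \theta_b + \mu_{\theta_b})] = \sum_{c \in D_a^{\mathcal{T}_k}} r_{ab} x_{ab}\big(\Omega_p(c,c) - \Omega_q(c,c)\big) + (x_{ab}^2 - r_{ab}^2)\,\Omega_{pq}(c,c) \tag{13}$$

Eqs. (12, 13) can be derived through the same analysis as
the one leading to Eq. (11). Note that for each equation in
Lemma 2, the right side contains power covariance terms
originating from the nodes in $D_a^{\mathcal{T}_k}$ alone. Thus, if the
covariances of all descendants $c \neq a \in D_a^{\mathcal{T}_k}$ are known,
Eqs. (11, 12, 13) can be used to infer the three covariance
quantities $(\Omega_p(a,a), \Omega_q(a,a), \Omega_{pq}(a,a))$ associated with node
$a$. Furthermore, parameters $(r_{ab}, x_{ab})$ included in these equations
pertain to the single operational line $(a,b)$. For the case
where injection covariances $\Omega_p, \Omega_q$ are known from historical
data, we can thus estimate the parameters of line $(a,b)$ as
well as $\Omega_{pq}(a,a)$, the covariance between active and reactive
injections at node $a$. We use these facts later in the text while
designing our learning algorithms.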
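When the line parameters are known, the three relations of Lemma 2 are linear in the covariance sums over the descendant set, so the quantities at node a follow from a 3x3 solve and a subtraction. The sketch below is one possible way to do this, not the authors' implementation; `det3`, `solve3` and `node_covariances` are hypothetical helper names, and A, B, C stand for the measured left-hand sides of Eqs. (11), (12) and (13).

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def solve3(m, rhs):
    """Solve a 3x3 linear system with Cramer's rule."""
    d = det3(m)
    solution = []
    for k in range(3):
        mk = [row[:] for row in m]
        for i in range(3):
            mk[i][k] = rhs[i]  # replace column k by the right-hand side
        solution.append(det3(mk) / d)
    return solution

def node_covariances(A, B, C, r_ab, x_ab, desc_p, desc_q, desc_pq):
    """Recover (var_p, var_q, cov_pq) at node a from the left sides of Eqs. (11)-(13).

    desc_p, desc_q, desc_pq are the already-known covariance sums over the
    strict descendants of a (zero when a is a leaf).
    """
    m = [
        [r_ab ** 2, x_ab ** 2, 2 * r_ab * x_ab],
        [x_ab ** 2, r_ab ** 2, -2 * r_ab * x_ab],
        [r_ab * x_ab, -r_ab * x_ab, x_ab ** 2 - r_ab ** 2],
    ]
    tot_p, tot_q, tot_pq = solve3(m, [A, B, C])
    return tot_p - desc_p, tot_q - desc_q, tot_pq - desc_pq
```

The coefficient matrix has determinant $-(r_{ab}^2 + x_{ab}^2)^3$, so the system is always solvable for a line with non-zero impedance.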

Next, we prove an important inequality involving the magnitude
of $E[(\varepsilon_a - \mu_{\varepsilon_a}) - (\varepsilon_b - \mu_{\varepsilon_b})]^2$ on the grid nodes.

Fig. 2. Schematic layout of a distribution grid tree. The sub-station node,
represented by the large red node, is the slack bus. $D_a^{\mathcal{T}_k}$ represents the set of nodes
that are descendants of node a. (a) Node a is a descendant of node b, while
node b is a descendant of node c. (b) Nodes a and c are descendants of node
b along disjoint sub-trees.

Lemma 3. For distinct nodes $a$, $b$ and $c$ that belong to the
same tree $\mathcal{T}_k$, $E[(\varepsilon_a - \mu_{\varepsilon_a}) - (\varepsilon_b - \mu_{\varepsilon_b})]^2 < E[(\varepsilon_a - \mu_{\varepsilon_a}) - (\varepsilon_c - \mu_{\varepsilon_c})]^2$
holds for the following cases:

1) Node $a$ is a descendant of node $b$ and node $b$ is a
descendant of node $c$ (see Fig. 2(a)),

2) Nodes $a$ and $c$ are descendants of node $b$ and the path
from $a$ to $c$ passes through node $b$ (see Fig. 2(b)).

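Proof: Let us first prove the Lemma for Case 1. As
shown in Fig. 2(a), one observes $D_a^{\mathcal{T}_k} \subset D_b^{\mathcal{T}_k} \subset D_c^{\mathcal{T}_k}$. Further,
$\mathcal{E}_a^{\mathcal{T}_k} - \mathcal{E}_b^{\mathcal{T}_k} \subset \mathcal{E}_a^{\mathcal{T}_k} - \mathcal{E}_c^{\mathcal{T}_k}$, where $\mathcal{E}_a^{\mathcal{T}_k}$ represents edges traversed
along the path leading from node $a$ to the root of $\mathcal{T}_k$. Consider
a node $d$ in the tree $\mathcal{T}_k$. When $d \in D_a^{\mathcal{T}_k}$, one uses Eq. (6) to derive

$$H_{1/r}^{-1}(a,d) - H_{1/r}^{-1}(b,d) = \sum_{(ef) \in \mathcal{E}_a^{\mathcal{T}_k} - \mathcal{E}_b^{\mathcal{T}_k}} r_{ef} < \sum_{(ef) \in \mathcal{E}_a^{\mathcal{T}_k} - \mathcal{E}_c^{\mathcal{T}_k}} r_{ef}
\;\Rightarrow\; H_{1/r}^{-1}(a,d) - H_{1/r}^{-1}(b,d) < H_{1/r}^{-1}(a,d) - H_{1/r}^{-1}(c,d) \tag{14}$$

Similarly, for node $d \in D_b^{\mathcal{T}_k} - D_a^{\mathcal{T}_k}$ one obtains

$$H_{1/r}^{-1}(a,d) - H_{1/r}^{-1}(b,d) = \sum_{(ef) \in \mathcal{E}_a^{\mathcal{T}_k} \cap \mathcal{E}_d^{\mathcal{T}_k} - \mathcal{E}_b^{\mathcal{T}_k}} r_{ef} < \sum_{(ef) \in \mathcal{E}_a^{\mathcal{T}_k} \cap \mathcal{E}_d^{\mathcal{T}_k} - \mathcal{E}_c^{\mathcal{T}_k}} r_{ef}
\;\Rightarrow\; H_{1/r}^{-1}(a,d) - H_{1/r}^{-1}(b,d) < H_{1/r}^{-1}(a,d) - H_{1/r}^{-1}(c,d) \tag{15}$$

For $d \in D_c^{\mathcal{T}_k} - D_b^{\mathcal{T}_k}$, one arrives at

$$H_{1/r}^{-1}(a,d) - H_{1/r}^{-1}(b,d) = 0 \le \sum_{(ef) \in \mathcal{E}_a^{\mathcal{T}_k} \cap \mathcal{E}_d^{\mathcal{T}_k} - \mathcal{E}_c^{\mathcal{T}_k}} r_{ef}
\;\Rightarrow\; H_{1/r}^{-1}(a,d) - H_{1/r}^{-1}(b,d) \le H_{1/r}^{-1}(a,d) - H_{1/r}^{-1}(c,d) \tag{16}$$

Next, using Eqs. (14, 15, 16), we arrive at

$$\forall d \in D_c^{\mathcal{T}_k}: \quad H_{1/r}^{-1}(a,d) - H_{1/r}^{-1}(b,d) \le H_{1/r}^{-1}(a,d) - H_{1/r}^{-1}(c,d) \tag{17}$$

$$\forall d \notin D_c^{\mathcal{T}_k}: \quad H_{1/r}^{-1}(a,d) - H_{1/r}^{-1}(b,d) = H_{1/r}^{-1}(a,d) - H_{1/r}^{-1}(c,d) = 0 \tag{18}$$

with strict inequality in (17) for every $d \in D_b^{\mathcal{T}_k}$. Similar inequalities hold for $H_{1/x}^{-1}$ as well. We can now apply
Eqs. (17, 18) to Eq. (10) to prove $E[(\varepsilon_a - \mu_{\varepsilon_a}) - (\varepsilon_b - \mu_{\varepsilon_b})]^2 <
E[(\varepsilon_a - \mu_{\varepsilon_a}) - (\varepsilon_c - \mu_{\varepsilon_c})]^2$ for Case 1.

In Case 2 (see Fig. 2(b)), nodes $a$ and $c$ are descendants
of node $b$. Consider the penultimate (second to the last) node
lying on the path from $a$ to $b$, and the penultimate node on
the path from $c$ to $b$. The descendant sets of these two nodes contain
$a$ and $c$, respectively, and are disjoint subsets
of $D_b^{\mathcal{T}_k}$. Then, for any $d_a$ in the first set and $d_c$ in the second, the paths to the root
from nodes in different sets share exactly the edges in $\mathcal{E}_b^{\mathcal{T}_k}$, which results in

$$H_{1/r}^{-1}(b,d_a) - H_{1/r}^{-1}(a,d_a) = H_{1/r}^{-1}(c,d_a) - H_{1/r}^{-1}(a,d_a) \tag{19}$$

$$H_{1/r}^{-1}(b,d_c) - H_{1/r}^{-1}(a,d_c) = 0 < H_{1/r}^{-1}(c,d_c) - H_{1/r}^{-1}(a,d_c) \tag{20}$$

Furthermore, for $d$ in neither of the two sets,

$$H_{1/r}^{-1}(b,d) - H_{1/r}^{-1}(a,d) = 0 = H_{1/r}^{-1}(c,d) - H_{1/r}^{-1}(a,d) \tag{21}$$

Versions of Eqs. (19, 20, 21) for $H_{1/x}^{-1}$ can be derived in a similar
way. Using these results in Eq. (10), one arrives at $E[(\varepsilon_a - \mu_{\varepsilon_a}) - (\varepsilon_b - \mu_{\varepsilon_b})]^2 <
E[(\varepsilon_a - \mu_{\varepsilon_a}) - (\varepsilon_c - \mu_{\varepsilon_c})]^2$ for Case 2. This
completes the proof. ■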

The following theorem follows directly from Lemma 3.

Theorem 2. For every node $a$ with set of descendants $D_a^{\mathcal{T}_k}$
and parent $b$,

$$b = \arg\min_{c \notin D_a^{\mathcal{T}_k}} E[(\varepsilon_a - \mu_{\varepsilon_a}) - (\varepsilon_c - \mu_{\varepsilon_c})]^2.$$

Proof: By Case 2 of Lemma 3, the optimal node for
$\arg\min_{c \notin D_a^{\mathcal{T}_k}} E[(\varepsilon_a - \mu_{\varepsilon_a}) - (\varepsilon_c - \mu_{\varepsilon_c})]^2$ lies on the path from
node $a$ to the root. Considering Case 1, one finds that the
optimal node on that path is node $a$’s parent $b$. ■

Theorem 2 implies that among all non-descendants of a
node, the minimum expected squared centered difference of
voltage magnitude deviations is achieved at its parent node.
Indeed, in the next Section, we utilize this result to identify a
node’s parent.
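The parent-identification rule of Theorem 2 can be tried on a small example. The sketch below is illustrative and rests on stated assumptions: the five-node tree, its line parameters and the (uncorrelated, positively coupled) load covariances are hypothetical, the squared centered differences are computed from the model rather than from samples, and only load nodes are considered as candidate parents. All function names are chosen for this example.

```python
# Hypothetical tree: substation 0 -> 1; 1 -> 2 and 1 -> 3; 2 -> 4.
parent = {1: 0, 2: 1, 3: 1, 4: 2}
r = {1: 0.10, 2: 0.20, 3: 0.15, 4: 0.05}  # line (a, parent[a]) resistance
x = {1: 0.05, 2: 0.10, 3: 0.30, 4: 0.02}  # line (a, parent[a]) reactance
var_p = {a: 0.04 for a in parent}
var_q = {a: 0.01 for a in parent}
cov_pq = {a: 0.01 for a in parent}

def path_edges(a):
    """Edges (named by their child node) on the path from a to the substation."""
    edges = set()
    while a != 0:
        edges.add(a)
        a = parent[a]
    return edges

def h_inv(weight, a, b):
    """Inverse-Laplacian entry: weights summed over edges shared by both paths."""
    return sum(weight[e] for e in path_edges(a) & path_edges(b))

def descendants(a):
    """Nodes whose path to the substation passes through a (a itself included)."""
    return {c for c in parent if a in path_edges(c)}

def sq_diff(a, c):
    """Expected squared centered difference of voltage deviations at a and c."""
    total = 0.0
    for d in parent:
        dr = h_inv(r, a, d) - h_inv(r, c, d)
        dx = h_inv(x, a, d) - h_inv(x, c, d)
        total += dr * dr * var_p[d] + dx * dx * var_q[d] + 2 * dr * dx * cov_pq[d]
    return total

def closest_non_descendant(a):
    """Load node outside the descendant set of a with the smallest squared difference."""
    candidates = [c for c in parent if c not in descendants(a)]
    return min(candidates, key=lambda c: sq_diff(a, c))

# Nodes fed directly by the substation are skipped (their parent is not a load node).
identified = {a: closest_non_descendant(a) for a in parent if parent[a] != 0}
```

For this toy tree the dictionary `identified` coincides with the true parent map restricted to nodes 2, 3 and 4.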


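Algorithm 1 Base Constrained Spanning Forest Learning with
Estimation of Load Statistics

Input: $m$ phase angle and voltage deviation observations $\theta^j$
and $\varepsilon^j$, $1 \le j \le m$; all line resistances $r$ and line reactances $x$
Output: Covariance matrices $\Omega_p$, $\Omega_q$ and $\Omega_{pq}$, mean vectors
$\mu_p$ and $\mu_q$

 1: Compute $\mu_{\theta_a} = \sum_{j=1}^{m} \theta_a^j / m$, $\mu_{\varepsilon_a} = \sum_{j=1}^{m} \varepsilon_a^j / m$,
    $\Omega_\theta(a,a) = \sum_{j=1}^{m} \theta_a^j \theta_a^j / m - \mu_{\theta_a}^2$ and
    $\Omega_\varepsilon(a,a) = \sum_{j=1}^{m} \varepsilon_a^j \varepsilon_a^j / m - \mu_{\varepsilon_a}^2$ for all nodes $a$.
 2: Undiscovered Set $U \leftarrow \{1, 2, \ldots, N\}$, Leaf Set $L \leftarrow \emptyset$,
    Descendant Covariance vectors $D^p \leftarrow 0$, $D^q \leftarrow 0$, $D^{pq} \leftarrow 0$.
 3: while ($U \neq \emptyset$) do
 4:     $b^* \leftarrow \arg\max_{b \in U} \Omega_\varepsilon(b,b)$
 5:     for all $a \in L$ do
 6:         if $b^* = \arg\min_{c \in U} \sum_{j=1}^{m} [(\varepsilon_a^j - \mu_{\varepsilon_a}) - (\varepsilon_c^j - \mu_{\varepsilon_c})]^2 / m$ then
 7:             Draw edge between nodes $a$ and $b^*$
 8:             Solve Eqs. (11, 12, 13) to get $\Omega_p(a,a)$, $\Omega_q(a,a)$ and $\Omega_{pq}(a,a)$
 9:             $D^p(b^*) \leftarrow D^p(b^*) + \Omega_p(a,a) + D^p(a)$
10:             $D^q(b^*) \leftarrow D^q(b^*) + \Omega_q(a,a) + D^q(a)$
11:             $D^{pq}(b^*) \leftarrow D^{pq}(b^*) + \Omega_{pq}(a,a) + D^{pq}(a)$
12:             $L \leftarrow L - \{a\}$
13:         end if
14:     end for
15:     $L \leftarrow L \cup \{b^*\}$, $U \leftarrow U - \{b^*\}$
16: end while
17: Generate $H_{1/x}$ and $H_{1/r}$ from the learned edges
18: Solve $\mu_\theta = H_{1/x}^{-1}\mu_p - H_{1/r}^{-1}\mu_q$, $\mu_\varepsilon = H_{1/r}^{-1}\mu_p + H_{1/x}^{-1}\mu_q$ for $\mu_p$ and $\mu_q$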


IV. Learning Grid Structure with Estimation of
Load or Parameters

We first present our algorithm design for Tasks 1 and 2,
structure learning coupled with estimation of nodal power
injection statistics. Next, we look at solving Tasks 1 and 3,
structure learning coupled with estimation of line parameters.

A. Learning Structure and Injection Statistics

The results of the previous Section (specifically, Theorems
1 and 2 and Lemma 2) provide the machinery for the
algorithm design. Algorithm 1 learns the radial operational
structure (Task 1) and estimates the mean $\mu_p$ and
covariance $\Omega_p$ of the power injections at the load nodes (Task
2). The observer here is assumed to be aware of the load
nodes that are connected directly to the grid sub-stations. This
is necessary as the assignment of substations, one per tree
in forest $\mathcal{T}$, cannot be uniquely determined otherwise. This occurs due
to the assumption of zero fluctuations of voltage magnitude
and phase at substations, which makes the relations involving
voltage deviations in the previous Section insensitive when
the substation is the parent node. Resistance and reactance
parameters of all lines (open and operational) are assumed
known here.

Algorithm Overview: In each iteration, the node $b^*$ with
the highest variance in voltage deviation among the yet
undiscovered node set $U$ is selected in Step 4. Theorem 1
ensures that selecting nodes in the decreasing order of their
variances leads to discovery of node $b^*$ only after all its
descendants have been discovered previously. Set $L$ denotes
the current set of leaves (previously discovered nodes with
unknown parents). In Step 6, the selected node $b^*$ is made
the parent of a node in set $L$ if the condition in Theorem 2
is satisfied. Here, each entry in the descendant covariance
vectors $D^p$, $D^q$ and $D^{pq}$ contains the sum of load power
covariances over all descendants of each node, other than
the node itself. The values of the power injection covariances
for the newly connected child node are inferred in Step 8 using Lemma 2. Steps
12 and 15 are used to update the current set of leaves $L$
for use in the next iteration. Finally, in Step 18 the mean
of nodal power is computed using the measurement matrix
$H$ constructed from the grid structure. Note that instead of
learning the covariances in $\Omega_p$ sequentially through Step 8,
one can use the generated measurement matrix $H$ directly to
learn them together at the end.


Algorithm Complexity: Computing the empirical covariance
matrix of voltage deviations is considered to be a part of
pre-processing and is thus ignored in the complexity estimation.
One makes $N$ iterations to select all the non-substation nodes.
Within each iteration, an edge selection (Step 6) calls for a
check with each node in $L$. Thus, the worst-case complexity
for learning the structure is $O(N^2)$. Computing the means
and the covariances through matrix
multiplication is likewise of polynomial complexity in $N$.

Observe that learning the forest structure in Algorithm 1
relies on voltage magnitude deviation measurements alone,
and in fact does not require knowledge of line parameters in
the grid. Phase measurements and values of line resistance
and reactance are needed only to estimate the means and
covariances of power injections.
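The structure-learning loop of Algorithm 1 (Steps 2 to 7, 12 and 15) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it handles a single tree, it replaces empirical moments by exact model moments of a hypothetical five-node feeder so that the run is deterministic, and it omits the covariance bookkeeping of Steps 8 to 11. The function `learn_tree` and the other names are introduced here for illustration.

```python
# Ground truth used only to generate second moments (hypothetical feeder).
parent = {1: 0, 2: 1, 3: 1, 4: 2}
r = {1: 0.10, 2: 0.20, 3: 0.15, 4: 0.05}
x = {1: 0.05, 2: 0.10, 3: 0.30, 4: 0.02}
var_p = {a: 0.04 for a in parent}
var_q = {a: 0.01 for a in parent}
cov_pq = {a: 0.01 for a in parent}

def path_edges(a):
    """Edges (named by their child node) on the path from a to the substation."""
    edges = set()
    while a != 0:
        edges.add(a)
        a = parent[a]
    return edges

def h_inv(weight, a, b):
    """Inverse-Laplacian entry: weights summed over edges shared by both paths."""
    return sum(weight[e] for e in path_edges(a) & path_edges(b))

def moment(a, c=None):
    """Variance of the voltage deviation at a, or squared centered difference with c."""
    total = 0.0
    for d in parent:
        dr = h_inv(r, a, d) - (h_inv(r, c, d) if c is not None else 0.0)
        dx = h_inv(x, a, d) - (h_inv(x, c, d) if c is not None else 0.0)
        total += dr * dr * var_p[d] + dx * dx * var_q[d] + 2 * dr * dx * cov_pq[d]
    return total

def learn_tree(nodes, var_eps, sq_diff):
    """Greedy parent discovery from voltage-magnitude second moments only."""
    undiscovered, leaves, learned_parent = set(nodes), set(), {}
    while undiscovered:
        # Step 4: pick the undiscovered node with the largest variance.
        b_star = max(undiscovered, key=lambda b: var_eps[b])
        for a in sorted(leaves):
            # Step 6: b_star is the parent of a if it minimizes the squared difference.
            if min(undiscovered, key=lambda c: sq_diff(a, c)) == b_star:
                learned_parent[a] = b_star
                leaves.discard(a)
        leaves.add(b_star)
        undiscovered.discard(b_star)
    return learned_parent, leaves  # remaining leaves attach to the substation

var_eps = {a: moment(a) for a in parent}
learned_parent, top_nodes = learn_tree(parent, var_eps, moment)
```

With sampled data, `var_eps` and `sq_diff` would be replaced by their empirical counterparts from Steps 1 and 6; the nodes left in `top_nodes` are those the observer knows to be connected directly to the substation.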



B. Learning Structure and Line Parameters 

The first goal of the observer here is the same as in Section
IV-A: to learn the operational grid structure. However, we
consider a modified scenario where the covariances for active
and reactive nodal injections ($\Omega_p$ and $\Omega_q$) are already known
from historical data and thus do not need to be estimated.
Instead, the observer here aims at estimating the impedance
parameters ($r_{ab}$ and $x_{ab}$) for each operational line $(a,b)$ within
the grid. Consider Eqs. (11, 12, 13). If matrix $\Omega_{pq}$ is also
known, the observer can easily solve these linear equations
with $r_{ab}^2$, $x_{ab}^2$ and $(r_{ab} x_{ab})$ as the three unknowns to estimate
the impedance for each operational edge. However, $\Omega_{pq}$ may
be harder to obtain in reality as its computation requires
time-synchronized historical samples of active and reactive
injections. If information on $\Omega_{pq}$ is unavailable, variables
$r_{ab}$, $x_{ab}$ and $\Omega_{pq}(a,a)$ are the three unknowns of the nonlinear Eqs. (11, 12, 13) for
each edge $(a,b)$. Note that Algorithm 1 infers the radial grid
structure iteratively upward from the descendant nodes to the
parents. Therefore, we also infer line parameters $(r_{ab}, x_{ab})$ and
$\Omega_{pq}(a,a)$ by solving Eqs. (11, 12, 13) in each iteration for the
newly discovered edge $(a,b)$ between node $a$ and its parent $b$
in tree $\mathcal{T}_k$. Let $A$, $B$ and $C$ denote the expressions on the left
side of Eqs. (11, 12, 13) respectively. From Eqs. (11, 12), we
derive

$$r_{ab}^2 + x_{ab}^2 = (A + B) \Big/ \sum_{c \in D_a^{\mathcal{T}_k}} \big(\Omega_p(c,c) + \Omega_q(c,c)\big).$$

We can now eliminate terms involving $x_{ab}$ and $\Omega_{pq}$ to get Eq. (22),
which is a quadratic expression in $r_{ab}^2$. We use it to infer $r_{ab}$
and $x_{ab}$. To infer $\Omega_{pq}(a,a)$, we use values of $\Omega_{pq}(c,c)$ for
descendants $c\,(\neq a)$ of node $a$ that are determined in previous
iterations.
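The elimination described above can be sketched in code. The function below is a hedged illustration, not the authors' procedure: it uses a quadratic in $r_{ab}^2$ derived directly from Eqs. (11), (12) and (13) for this example, it assumes positive resistance and reactance, and it keeps only roots for which the implied sum of active-reactive covariances is non-negative (in line with Assumption 1) and Eq. (13) is reproduced. The name `line_parameter_candidates` and its arguments are hypothetical; `sum_p` and `sum_q` are the known sums of active and reactive variances over the descendant set of node a.

```python
import math

def line_parameter_candidates(A, B, C, sum_p, sum_q):
    """Return (r_ab, x_ab, sum of cov_pq over descendants) triples consistent with A, B, C."""
    s = sum_p + sum_q
    k = (A + B) / s                      # r_ab^2 + x_ab^2, from Eqs. (11) and (12)
    g = A * sum_p - B * sum_q
    # Quadratic a2*u^2 + a1*u + a0 = 0 in u = r_ab^2 (x_ab and cov_pq eliminated).
    a2 = (A - B) ** 2 + 4 * C ** 2
    a1 = -(2 * (A * A - B * B) * g / s ** 2 + 4 * C * C * (A + B) / s)
    a0 = (A + B) ** 2 * g ** 2 / s ** 4
    root = math.sqrt(max(a1 * a1 - 4 * a2 * a0, 0.0))
    candidates = []
    for u in {(-a1 + root) / (2 * a2), (-a1 - root) / (2 * a2)}:
        if not 0 < u < k:
            continue                      # r_ab and x_ab must both be real and positive
        r_ab, x_ab = math.sqrt(u), math.sqrt(k - u)
        # Back out the covariance sum from the difference of Eqs. (11) and (12).
        sum_pq = (A - B - (2 * u - k) * (sum_p - sum_q)) / (4 * r_ab * x_ab)
        # Keep the root only if it also reproduces Eq. (13) and respects Assumption 1.
        c_check = r_ab * x_ab * (sum_p - sum_q) + (k - 2 * u) * sum_pq
        if sum_pq >= 0 and abs(c_check - C) <= 1e-9 * max(1.0, abs(C)):
            candidates.append((r_ab, x_ab, sum_pq))
    return candidates
```

As a usage example, with $r_{ab} = 1$, $x_{ab} = 2$, variance sums 3 and 1, and covariance sum 0.5, the left sides are A = 9, B = 11 and C = 5.5; the quadratic then has roots 1 and 3.2, and the second root is rejected because it would imply a negative covariance sum. Subtracting the known descendant covariances from the returned sum yields the covariance at node a.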

Every step in this algorithm, except the modified Step 8,
corresponds to the respective step in Algorithm 1. Step 8 is
modified such that Eq. (22), which follows from Eqs. (11, 12, 13), is used
to derive the line parameters and $\Omega_{pq}$. As the formulation
and analysis of this algorithm follow those of Algorithm 1, we omit them for
brevity. In the next Section, we discuss a critical extension
of the structure learning problem (Task 1) to the case where
the available nodal data is incomplete due to some missing
entries.


V. Learning Base-Constrained Spanning Forest
with Missing Data

The structure learning problem discussed in the preceding
Section (Task (1)) requires the observer to have voltage
magnitude data for all nodes within the distribution grid. However,
this may not be true in practice. In fact, loss of communication
and/or synchronization troubles with meters over short periods
of time, along with meter breakdowns over longer time-scales,
can result in missing data over a set $\mathcal{M}$ of missing nodes
in the system. We assume here that the “missing” nodes are
not positioned fully arbitrarily within the grid, but satisfy
the following property.

Fig. 3. Schematic layout of a distribution grid tree with missing node
c. The sub-station node, shown as the large red node, is the slack bus. (a)
Missing node c is a leaf with parent a. (b) Missing node c is an intermediate
node with parent node a and grandparent node b.

Assumption 2: Missing nodes in set $\mathcal{M}$ are separated by
greater than two hops in the distribution grid forest and they
are not immediate children (not first descendants) of the
sub-station nodes. This assumption implies that there exists no
observed node which is connected to more than one missing
node. Note that a missing node can exist in either of the two
possible configurations, a leaf or an intermediary position,
as illustrated in Fig. 3. Assumption 2 guarantees that in
either case, both the parent and grandparent (parent of the
parent) nodes of the missing node are observed. Additionally,
unlike structure learning in Task (1), in this Section we assume
that information, e.g. estimated or originating from historical
measurements, on the actual values of the $\Omega_p$, $\Omega_q$ and $\Omega_{pq}$
covariance matrices and impedances of all lines is available.
We now construct Algorithm 2 to learn the operational grid
structure in the presence of a missing set $\mathcal{M}$ with nodes whose
voltage magnitude deviations are unknown.

Algorithm Overview: The construction of each operational
tree begins by picking node $b^*$ with the largest value of
variance in the voltage deviation and then advancing
along the Algorithm sequentially. Here, the current leaf set $L$
denotes the set of discovered nodes with yet unknown parents.
For every node $a$ in $L$, we observe its set of unconnected
descendants $D_a$. Here $D_a$ is empty if all of $a$’s non-leaf
children (immediate descendants) are known and have been
linked to it. Note that $a$ may be a parent to a missing leaf
node despite $D_a$ being empty. Thus, if $D_a$ is empty, the Algorithm
first checks, using Eq. (11), if the selected node $b^*$ is the parent of node $a$ with
all children discovered. If no link is found,
then Step 20 checks if node $b^*$ and node $a$ are connected
with a missing leaf node $c$ linked to $a$ in the configuration
shown in Fig. 3(a). If still no link is found, the Algorithm
stores $a$ as an unconnected descendant of $b^*$ in $D_{b^*}$. On the
other hand, if $D_a$ is non-empty, the Algorithm confirms, in Step
28, the existence of a missing intermediate node $c$ with parent
node $a$ and grandparent node $b^*$ in the configuration shown
in Fig. 3(b). One of these three checks is guaranteed to find
an edge due to Assumption 2. Following this, a new node
is selected in the next iteration. The Algorithm completes
when the set $U$ becomes empty. Since no child (immediate
descendant) of a substation node is missing (Assumption 2),
$U = \emptyset$ implies inclusion of all the missing nodes into the grid
structure ($\mathcal{M} = \emptyset$).

$$\frac{(A+B)^2\Big(A\sum_{c}\Omega_p(c,c) - B\sum_{c}\Omega_q(c,c)\Big)^2}{\Big(\sum_{c}\Omega_p(c,c)+\Omega_q(c,c)\Big)^4} - 2 r_{ab}^2\left[\frac{\Big(A\sum_{c}\Omega_p(c,c) - B\sum_{c}\Omega_q(c,c)\Big)(A^2-B^2)}{\Big(\sum_{c}\Omega_p(c,c)+\Omega_q(c,c)\Big)^2} + \frac{2C^2(A+B)}{\sum_{c}\Omega_p(c,c)+\Omega_q(c,c)}\right] + r_{ab}^4\big[(A-B)^2 + 4C^2\big] = 0, \quad \text{with all sums over } c \in D_a^{\mathcal{T}_k} \tag{22}$$

Algorithm Complexity: Following the complexity analysis
of Algorithm 1, we estimate, counting the number of possible
comparisons, that the worst-case complexity of Algorithm
2 is $O((N - |\mathcal{M}|)|\mathcal{M}|)$.


VI. Experiments 

We test the performance of Algorithms 1 and 2 on three
distribution grid test cases listed in Table II and described
in detail in Part I [1].

TABLE II
Summary of the tested distribution grids

Test Case  | Number of buses / substations / tie-switches | Additional non-operational lines | Source
-----------+----------------------------------------------+----------------------------------+-------
bus_13_3   | 13/3/3                                       | 10                               | [19]
bus_29_1   | 29/1/1                                       | 20                               | [20]
bus_83_11  | 83/11/13                                     | 30                               | [21]



For each experiment here, we pick an operational spanning
forest layout $\mathcal{T}$ from the loopy grid graph $\mathcal{G}$ of a test system by
opening the additional lines as well as the tie-switches. For this
configuration, we choose statistics of consumption at each load 
node (we consider Gaussian for all experiments) and use it to 
generate multiple samples of nodal power injection. For each 
vector-valued sample, we fix voltages at the substations and 
run power flows to derive voltage magnitudes and phases at 
every node. Then, we compute empirical correlation functions 
of phases and voltages, averaging over all the generated 
samples. Finally, a valid observation set is created by hiding 
all the operational information other than what is required as 
input. Then, we run our algorithms and compare the resulting 
reconstruction with the actual operational case. 
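
The data-generation steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the simulation code used for the reported results: it replaces the AC power flow solver with a linearized sweep over a hypothetical five-node feeder, produces voltage magnitude deviations only, and uses made-up line parameters and Gaussian injection statistics. The function name `simulate` and all numerical values are assumptions.

```python
import numpy as np

# Hypothetical feeder: node 0 is the substation; parents have smaller indices than children.
parent = [-1, 0, 1, 1, 3]
r = np.array([0.0, 0.10, 0.20, 0.15, 0.10])   # resistance of line (a, parent[a])
x = np.array([0.0, 0.05, 0.10, 0.05, 0.20])   # reactance of line (a, parent[a])
mean_p = np.array([0.0, -1.0, -0.8, -1.2, -0.9])   # assumed mean injections (loads negative)
mean_q = 0.3 * mean_p
std_p = np.array([0.0, 0.10, 0.08, 0.12, 0.09])
std_q = 0.3 * std_p

def simulate(num_samples, rng):
    """Draw Gaussian injections and return (p, q, eps) with one row per sample."""
    n = len(parent)
    p = mean_p + std_p * rng.standard_normal((num_samples, n))
    q = mean_q + std_q * rng.standard_normal((num_samples, n))
    # Bottom-up sweep: the flow on line (a, parent[a]) is the total injection below a.
    flow_p, flow_q = p.copy(), q.copy()
    for a in range(n - 1, 0, -1):
        flow_p[:, parent[a]] += flow_p[:, a]
        flow_q[:, parent[a]] += flow_q[:, a]
    # Top-down sweep: the substation voltage is fixed, deviations accumulate along lines.
    eps = np.zeros((num_samples, n))
    for a in range(1, n):
        eps[:, a] = eps[:, parent[a]] + r[a] * flow_p[:, a] + x[a] * flow_q[:, a]
    return p, q, eps

rng = np.random.default_rng(0)
p, q, eps = simulate(5000, rng)
mean_eps = eps.mean(axis=0)                       # empirical means
cov_eps = np.cov(eps, rowvar=False, bias=True)    # empirical covariance (divide by m)
observed = eps[:, 1:]                             # only voltage data reaches the observer
print(mean_eps.round(3))
```

Only `observed` would be handed to a learning routine; the injections `p` and `q` and the tree in `parent` play the role of the hidden operational information against which the reconstruction is compared.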

We start by simulating Algorithm 1. For brevity, we present 
results on learning the grid structure together with inference of load 
statistics (and not inference of line impedance parameters). 
Here, the observer has access to phases and voltage magnitudes 
at all the nodes as input. As noted in Table I, voltage 
magnitudes are sufficient for reconstructing the grid structure, 
but inference of load statistics requires both voltage magnitude 
and phase measurements. Figs. 4(a) and 4(b) show the 
change in the average fractional errors in estimating the means 
and covariances of the nodal injections with increasing number 
of samples for the three test systems considered. For both 
estimated quantities, the fractional error is the difference 
between the true and estimated values relative 
to the true value. It is clear from the figures that the average 
fractional error decays exponentially with the number of 
samples. Comparing Fig. 4(c) with Figs. 4(a) and 4(b), we see 
that the number of samples required to accurately reconstruct 
the topology is much smaller than that required to reconstruct the nodal 
power distributions. Only when the number of samples is less 
than 100 does the reconstruction of the topology begin to 
suffer. 
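
As a concrete reading of the error measure, the sketch below computes an average fractional error for estimated means at several sample sizes. It is a hypothetical Python example: the entrywise absolute relative error averaged over nodes is one plausible interpretation of the definition given above, and the injection statistics are made up rather than taken from the test cases.

```python
import numpy as np

def fractional_error(true_values, estimates):
    """Mean over entries of |estimate - true| / |true| (assumed error definition)."""
    true_values = np.asarray(true_values, dtype=float)
    estimates = np.asarray(estimates, dtype=float)
    return float(np.mean(np.abs(estimates - true_values) / np.abs(true_values)))

rng = np.random.default_rng(1)
true_mean = np.array([-1.0, -0.8, -1.2, -0.9])   # hypothetical nodal mean injections
true_std = 0.1 * np.abs(true_mean)

errors = {}
for m in (10, 100, 1000, 10000):
    # Draw m Gaussian samples per node and estimate the means empirically.
    samples = true_mean + true_std * rng.standard_normal((m, true_mean.size))
    errors[m] = fractional_error(true_mean, samples.mean(axis=0))

for m, err in errors.items():
    print(m, err)
```

The same function can be applied to covariance entries by passing the true and estimated covariance matrices, provided no true entry is zero.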

Next, we turn to Algorithm 2, which learns the 
grid structure from voltage magnitude measurements at a 
subset of the grid nodes. The actual covariance of the nodal 
injections of active and reactive powers is assumed known 
to the observer in this case. As described previously, we 
generate samples of the active and reactive injections and run 
power flows to generate samples of voltage magnitudes, but 
then erase the samples at the missing nodes before passing the rest to the observer. We 
study the average fractional error in learning the grid structure 
as a function of the number of measurement samples. Note that 
averaging here is over both the selection of the missing nodes and 
the statistics of nodal injections. Figs. 5(a), 5(b) and 5(c) 
show results for the bus_13_3, bus_29_1 and bus_83_11 models, 
respectively. Different curves within each figure correspond 
to different numbers of missing nodes. As expected, the 
number of errors increases with the 
number of missing nodes. The decay in the average fractional error is 
exponential in the number of samples. 
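
The erasure step and a structural error measure can be sketched as follows. This Python fragment is hypothetical: the helper names `hide_nodes` and `structure_error` are introduced here, the missing nodes are drawn uniformly at random without enforcing the placement restrictions of Assumption 3, and counting missed plus spurious edges relative to the number of true edges is only one possible definition of the fractional error in the learned structure.

```python
import numpy as np

def hide_nodes(samples, num_missing, rng):
    """Drop the columns of randomly chosen nodes; return (observed samples, missing indices)."""
    n = samples.shape[1]
    missing = np.sort(rng.choice(n, size=num_missing, replace=False))
    keep = np.setdiff1d(np.arange(n), missing)
    return samples[:, keep], missing

def structure_error(true_edges, learned_edges):
    """(missed + spurious edges) / number of true edges, with undirected edges."""
    true_set = {frozenset(e) for e in true_edges}
    learned_set = {frozenset(e) for e in learned_edges}
    return len(true_set ^ learned_set) / len(true_set)

rng = np.random.default_rng(0)
samples = np.arange(20, dtype=float).reshape(4, 5)   # 4 samples at 5 nodes (placeholder data)
observed, missing = hide_nodes(samples, 2, rng)
print(observed.shape, missing)

# One wrong edge out of three: (1, 3) is missed and (2, 3) is spurious.
print(structure_error([(0, 1), (1, 2), (1, 3)], [(1, 0), (2, 1), (2, 3)]))
```

Averaging `structure_error` over repeated draws of the missing set and of the injection samples would give curves of the kind plotted against the number of samples.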


VII. Conclusions 

We have considered three critical problems in radial distribution 
grids: learning the operational radial structure (Task 1), 
inferring the nodal load statistics (Task 2), and estimating the 
impedance parameters of operational lines (Task 3). In Part I 
[1], we presented a polynomial time algorithm that uses 
nodal voltage magnitude samples, together with available information 
on nodal injection statistics and line parameters, to accomplish 
Task 1. That algorithm is based on the assumption that the second 
moments of nodal power injections are positive. In Part II, we 
have instead assumed independence of fluctuations in nodal injections 
and used it to develop a new polynomial time algorithm 
that solves Task 1 coupled with either Task 2 or Task 3. 
Importantly, under this modified assumption, voltage magnitude 
measurements are sufficient to learn the operational 
radial grid, even in the absence of any information on the line 
parameters or injection statistics. Additional 
voltage phasor measurements are required to complete Tasks 
2 and 3. We have then presented a second algorithm 
that estimates the grid structure for systems with incomplete 
observability, where voltage magnitude measurements for a 
set of missing nodes are not available. It is worth mentioning 
































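Algorithm 2 Base Constrained Spanning Forest Learning with Missing Data 

Input: True $\Omega_p$, $\Omega_q$ and $\Omega_{pq}$; $m$ voltage deviation observations $\epsilon^j$, $1 \le j \le m$, for the observed nodes; all line resistances $r$ and line reactances $x$. Missing nodes set: $M$ 

1: Compute $\mu_{\epsilon_a} = \sum_{j=1}^{m} \epsilon_a^j / m$ and $\Omega_\epsilon(a,a) = \sum_{j=1}^{m} (\epsilon_a^j - \mu_{\epsilon_a})^2 / m$ for all observed nodes $a$. 
2: Undiscovered set $U \leftarrow \{1, 2, \ldots, N+K\}$, leaf set $L \leftarrow \emptyset$, unconnected descendant sets $D_a \leftarrow \emptyset$ $\forall$ nodes $a$, child active and reactive covariance vectors $D^p \leftarrow 0$, $D^q \leftarrow 0$, $D^{pq} \leftarrow 0$ 
3: while ($U \neq \emptyset$) do 
4:   $b^* \leftarrow \arg\max_{b \in U} \Omega_\epsilon(b,b)$ 
5:   for all $a \in L$ do 
6:     if $b^* = \arg\min_{c \in U} \sum_{j=1}^{m} [(\epsilon_a^j - \mu_{\epsilon_a}) - (\epsilon_c^j - \mu_{\epsilon_c})]^2 / m$ then 
7:       if $D_a = \emptyset$ then 
8:         if $\sum_{j=1}^{m} [(\epsilon_a^j - \mu_{\epsilon_a}) - (\epsilon_{b^*}^j - \mu_{\epsilon_{b^*}})]^2 / m = r_{ab^*}^2 (\Omega_p(a,a) + D^p(a)) + x_{ab^*}^2 (\Omega_q(a,a) + D^q(a)) + 2 r_{ab^*} x_{ab^*} (\Omega_{pq}(a,a) + D^{pq}(a))$ then 
9:           Draw edge between nodes $a$ and $b^*$ 
10:          $D^p(b^*) \leftarrow D^p(b^*) + \Omega_p(a,a) + D^p(a)$, $D^q(b^*) \leftarrow D^q(b^*) + \Omega_q(a,a) + D^q(a)$, 
             $D^{pq}(b^*) \leftarrow D^{pq}(b^*) + \Omega_{pq}(a,a) + D^{pq}(a)$ 
11:          $L \leftarrow L - \{a\}$ 
12:        else 
13:          if $\exists d \in M$ such that $\sum_{j=1}^{m} [(\epsilon_a^j - \mu_{\epsilon_a}) - (\epsilon_{b^*}^j - \mu_{\epsilon_{b^*}})]^2 / m = r_{ab^*}^2 (\Omega_p(a,a) + D^p(a) + \Omega_p(d,d)) + x_{ab^*}^2 (\Omega_q(a,a) + D^q(a) + \Omega_q(d,d)) + 2 r_{ab^*} x_{ab^*} (\Omega_{pq}(a,a) + D^{pq}(a) + \Omega_{pq}(d,d))$ then 
14:            Draw edges between nodes $a$ and $b^*$, and $a$ and $d$ 
15:            $D^p(b^*) \leftarrow D^p(b^*) + \Omega_p(a,a) + D^p(a) + \Omega_p(d,d)$, $D^q(b^*) \leftarrow D^q(b^*) + \Omega_q(a,a) + D^q(a) + \Omega_q(d,d)$, 
               $D^{pq}(b^*) \leftarrow D^{pq}(b^*) + \Omega_{pq}(a,a) + D^{pq}(a) + \Omega_{pq}(d,d)$ 
16:            $L \leftarrow L - \{a\}$, $M \leftarrow M - \{d\}$ 
17:          else 
18:            $D_{b^*} \leftarrow D_{b^*} \cup \{a\}$, $D^p(b^*) \leftarrow D^p(b^*) + \Omega_p(a,a) + D^p(a)$, $D^q(b^*) \leftarrow D^q(b^*) + \Omega_q(a,a) + D^q(a)$, 
               $D^{pq}(b^*) \leftarrow D^{pq}(b^*) + \Omega_{pq}(a,a) + D^{pq}(a)$ 
19:            $L \leftarrow L - \{a\}$ 
20:          end if 
21:        end if 
22:      else 
23:        Find $d \in M$ such that $\sum_{j=1}^{m} [(\epsilon_a^j - \mu_{\epsilon_a}) - (\epsilon_{b^*}^j - \mu_{\epsilon_{b^*}})]^2 / m = r_{ab^*}^2 (\Omega_p(a,a) + D^p(a) + \Omega_p(d,d)) + x_{ab^*}^2 (\Omega_q(a,a) + D^q(a) + \Omega_q(d,d)) + 2 r_{ab^*} x_{ab^*} (\Omega_{pq}(a,a) + D^{pq}(a) + \Omega_{pq}(d,d))$ 
24:        Draw edges between nodes $a$ and $b^*$, and nodes in $D_a$ and $d$ 
25:        $D^p(b^*) \leftarrow D^p(b^*) + \Omega_p(a,a) + D^p(a) + \Omega_p(d,d)$, $D^q(b^*) \leftarrow D^q(b^*) + \Omega_q(a,a) + D^q(a) + \Omega_q(d,d)$, 
           $D^{pq}(b^*) \leftarrow D^{pq}(b^*) + \Omega_{pq}(a,a) + D^{pq}(a) + \Omega_{pq}(d,d)$ 
26:        $L \leftarrow L - \{a\}$, $M \leftarrow M - \{d\}$ 
27:      end if 
28:    end if 
29:  end for 
30:  $L \leftarrow L \cup \{b^*\}$ 
31: end while 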


that the assumptions in Parts I and II of this paper, though 
different, simultaneously hold true for several realistic grids 
and time-scales. Moreover, neither assumption relies on any 
particular distribution for nodal injections. The performance of 
both algorithms has been demonstrated through simulations on 
a number of grid test cases. Apart from using these results 
to detect failures and to improve load control, this work 
has key implications in the related areas of non-intrusive control, 
quantification of measurement security, and prevention of 
adversarial attacks. Learning the grid structure under generalized 
power flow models and the related error analysis remain two 
interesting directions for future work. 

Fig. 4. Average fractional errors vs number of samples used in Algorithm 1 
for learning statistics of nodal injections and grid structure. 
(a) Means of nodal power injection, (b) Covariances of nodal power injection, 
(c) Grid (forest/tree) reconstruction. The number of samples used for graph 
(forest/tree) reconstruction is moderate in comparison to the numbers used to 
estimate statistics. 


Fig. 5. Accuracy of Algorithm 2 in learning the distribution grid structure 
vs number of measurement samples in the presence of missing data for the test cases 
(a) bus_13_3, (b) bus_29_1, and (c) bus_83_11. 